In [ ]:
import numpy as np
import scipy.linalg as sla
import numpy.linalg as la
import random
import matplotlib.pyplot as plt
%matplotlib inline

Obtaining eigenvalues and eigenvectors numerically

We want to prepare a matrix with deliberately chosen eigenvalues. Let's use diagonalization to write the matrix A:

$$A = U D U^{-1}$$

where we set $D$ to be a diagonal matrix with the pre-defined eigenvalues:

D = np.diag([lambda1, lambda2, ..., lambdan])

We need to generate a matrix $U$ that has an inverse. Orthogonal matrices are a great option here, since $U^{-1} = U^T$. We use QR decomposition to get an orthogonal matrix (you don't need to understand this method).

In [ ]:
n = 4
X = np.random.rand(n,n)
U,_ = sla.qr(X)

D = np.diag([6,2,4,7])

Now we can use diagonalization to write A

In [ ]:
A = U@D@U.T

And we can check that the eigenvalues are indeed what we expected:

In [ ]:
eigl, eigv = la.eig(A)
print(eigl)
print(eigv)

We want to find the eigenvector corresponding to the largest eigenvalue in magnitude. For that, we can use np.argsort, which returns the indices that sort the array in ascending order. Hence, we are interested in the last entry.

In [ ]:
eig_index_sort = np.argsort(abs(eigl))
print(eig_index_sort)
eigpos = eig_index_sort[-1]

Recall that eigenvectors are stored as columns! Hence this would be the eigenvector corresponding to the largest (in magnitude) eigenvalue.

In [ ]:
eigv[:,eigpos]

Let's also pick an initial vector:

In [ ]:
x0 = np.random.randn(n)
x0

Power Iteration

Power iteration should converge to a multiple of the eigenvector $u_1$ corresponding to the largest eigenvalue in magnitude.

$$x_k = (\lambda_1)^k \left[ \alpha_1 u_1 + \alpha_2 \left(\frac{\lambda_2}{\lambda_1}\right)^k u_2 + \dots \right]$$

Let's implement power iteration. We simply perform repeated matrix-vector multiplications using a for loop:

In [ ]:
x = x0
for i in range(40):
    x = A @ x
    print(x)
    
print('Exact eigenvector = ',eigv[:,eigpos])
  • What's the problem with this method?
  • Does anything useful come of this?
  • How do we fix it?
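
As a hint to the first question: the entries of the unnormalized iterates grow roughly like $|\lambda_1|^k$, so for a long enough run they eventually overflow. Here is a minimal sketch (reusing the A and x0 defined above) that tracks the norm of the iterates:

In [ ]:
# Sketch: track the norm of the unnormalized power iterates (reusing A and x0).
# The norm grows roughly like |lambda_1|^k = 7^k, so it would eventually
# overflow if we kept iterating long enough.
x = x0
for i in range(40):
    x = A @ x
    if (i+1) % 10 == 0:
        print(i+1, la.norm(x))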

We can get an estimate of the corresponding eigenvalue using the quotient $\frac{x^T A x}{x^T x}$:

In [ ]:
np.dot(x,A@x)/np.dot(x,x)

Normalized power iteration

Back to the beginning: Reset to the initial vector and normalize

In [ ]:
x = x0/la.norm(x0)

Implement normalized power iteration. We will start with 10 iterations, and see what happens...

In [ ]:
x = x0/la.norm(x0)

for i in range(10):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
    print(x)

print('exact = ' ,eigv[:,eigpos])

print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))

What if the starting guess does not have any component of $u_1$, i.e., if $\alpha_1 = 0$?

$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k \left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k \left[ \left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots \right]$$

In theory (i.e., in infinite precision calculations), if $\alpha_1 = 0$, power iteration will converge to a vector that is a multiple of the eigenvector $u_2$.

In practice, it is unlikely that a random vector $x_0$ will have no component along $u_1$. In the unlikely case that it does, rounding during the iterative process will usually reintroduce such a component.
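
The sketch below (not part of the original notebook, reusing A, eigv and eigpos from above) illustrates this: we deliberately remove the $u_1$ component from the starting vector, and rounding still steers the iterates back towards $u_1$ after enough iterations.

In [ ]:
# Sketch: start power iteration from a vector orthogonal to u1 (alpha_1 = 0).
u1 = eigv[:, eigpos]
y = np.random.randn(n)
y = y - (u1 @ y) * u1          # remove the u1 component from the starting guess
y = y / la.norm(y)

for i in range(1, 501):
    y = A @ y
    y = y / la.norm(y)
    if i % 100 == 0:
        # |u1 . y| starts near zero, but rounding reintroduces a tiny u1
        # component, which power iteration then amplifies towards 1.
        print(i, abs(u1 @ y))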

Creating a matrix where the dominant eigenvalue is negative

In [ ]:
n = 4
    
D = np.diag([-5,2,4,3])

A = U@D@U.T

eigl, eigv = la.eig(A)

eig_index_sort = np.argsort(abs(eigl))
eigpos = eig_index_sort[-1]

print(eigl)
print(eigv[:,eigpos])
In [ ]:
x = x0/la.norm(x0)

for i in range(40):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
    print(x)

print('exact = ' ,eigv[:,eigpos])

print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))

print(D)

What is happening here? Note that the scalar that multiplies the eigenvector $u_1$ in

$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k \left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k \left[ \left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots \right]$$

is $(\lambda_1)^k$, and hence if the eigenvalue $\lambda_1$ is negative, the solution of power iteration will converge to the eigenvector, but with alternating signs, i.e., $u_1$ and $-u_1$.
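
A common workaround (sketched below, not from the original notebook): fix the sign of the iterate at every step, for example by making its largest-magnitude entry positive. The alternating sign disappears, while the quotient $\frac{x^T A x}{x^T x}$ still recovers the negative eigenvalue.

In [ ]:
# Sketch: normalized power iteration with a sign convention to remove the flip.
x = x0/la.norm(x0)

for i in range(40):
    x = A @ x
    x = x/la.norm(x)
    j = np.argmax(np.abs(x))
    x = x*np.sign(x[j])        # make the largest-magnitude entry positive

print('x          = ', x)
print('exact      = ', eigv[:,eigpos])
print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))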

When dealing with dominant eigenvalues of multiplicity greater than 1: $|\lambda_1| = |\lambda_2|$ and $\lambda_1, \lambda_2 > 0$

In [ ]:
n = 4

D = np.diag([5,5,2,1])

A = U@D@U.T

eigl, eigv = la.eig(A)

print(eigl)
print(eigv[:,2])
print(eigv[:,3])
In [ ]:
x = x0/la.norm(x0)

for i in range(40):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
    print(x)

print('u1_exact = ' ,eigv[:,2])
print('u2_exact = ' ,eigv[:,3])

print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))

print(D)

In general, the power method iterates are given by:

$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k \left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k \left[ \left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots \right]$$

However if $|\lambda_1| = |\lambda_2|$ and $\lambda_1, \lambda_2 > 0$, we get:

$$x_k = (\lambda_1)^k (\alpha_1 u_1 + \alpha_2 u_2) + [\dots]$$

and hence the solution of power iteration will converge to a multiple of the linear combination $\alpha_1 u_1 + \alpha_2 u_2$ of the eigenvectors $u_1$ and $u_2$.
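
We can check this claim numerically (a minimal sketch, reusing the converged x, eigl and eigv from the cells above): the component of x outside the span of the two dominant eigenvectors should be negligible.

In [ ]:
# Sketch: verify that x lies (up to rounding) in span{u1, u2}.
idx = np.argsort(abs(eigl))
Q = eigv[:, idx[-2:]]                     # the two dominant eigenvectors
coef, *_ = la.lstsq(Q, x, rcond=None)     # least-squares fit of x in that span
print('component outside span{u1,u2} =', la.norm(x - Q @ coef))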

When dealing with dominant eigenvalues of multiplicity greater than 1: $|\lambda_1| = |\lambda_2|$ and $\lambda_1, \lambda_2 < 0$

In [ ]:
n = 4

D = np.diag([-5,-5,2,1])

A = U@D@U.T

eigl, eigv = la.eig(A)

print(eigl)
print(eigv[:,2])
print(eigv[:,3])
In [ ]:
x = x0/la.norm(x0)

for i in range(40):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
    print(x)

print('u1_exact = ' ,eigv[:,2])
print('u2_exact = ' ,eigv[:,3])

print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))

print(D)

In general, the power method iterates are given by:

$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k \left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k \left[ \left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots \right]$$

However if $|\lambda_1| = |\lambda_2|$ and $\lambda_1, \lambda_2 < 0$, we get:

$$x_k = \pm |\lambda_1|^k (\alpha_1 u_1 + \alpha_2 u_2) + [\dots]$$

and hence the solution of power iteration will converge to a multiple of the linear combination $\alpha_1 u_1 + \alpha_2 u_2$ of the eigenvectors $u_1$ and $u_2$, but the sign will flip at each step of the iterative method.

When dealing with dominant eigenvalues of equal magnitude: $|\lambda_1| = |\lambda_2|$ and $\lambda_1, \lambda_2$ have opposite signs

In [ ]:
n = 4

D = np.diag([-5,5,2,1])

A = U@D@U.T

eigl, eigv = la.eig(A)

print(eigl)
print(eigv[:,0])
print(eigv[:,1])
In [ ]:
x = x0/la.norm(x0)

for i in range(40):
    x = A @ x
    nrm = la.norm(x)
    x = x/nrm
    print(x)

print('u1_exact = ' ,eigv[:,0])
print('u2_exact = ' ,eigv[:,1])

print('eig_approx = ', np.dot(x,A@x)/np.dot(x,x))

print(D)

In general, the power method iterates are given by:

$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k \left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k \left[ \left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots \right]$$

However if $|\lambda_1| = |\lambda_2|$ and $\lambda_1, \lambda_2$ have opposite signs, we get:

$$x_k = \pm |\lambda_1|^k (\alpha_1 u_1 \pm \alpha_2 u_2) + [\dots]$$

and hence power iteration does not converge to a single vector. Indeed, the method oscillates between two linear combinations of the eigenvectors, and fails to give the correct eigenvalue.
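
The oscillation is easy to see numerically (a minimal sketch, reusing A and x0 from above): consecutive iterates are different vectors, while iterates two steps apart agree up to rounding.

In [ ]:
# Sketch: the iterates cycle between two vectors instead of converging.
x = x0/la.norm(x0)
history = []
for i in range(40):
    x = A @ x
    x = x/la.norm(x)
    history.append(x.copy())

print('||x_40 - x_39|| =', la.norm(history[-1] - history[-2]))   # large: consecutive iterates differ
print('||x_40 - x_38|| =', la.norm(history[-1] - history[-3]))   # ~ 0: period-two cycle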

Summary - Pitfalls of power iteration:

  • Risk of eventual overflow. Use normalized power iteration to avoid this.
  • If the initial guess has $\alpha_1 = 0$, the method will converge to a multiple of the eigenvector $u_2$ if infinite precision computation is used. In practice (in finite precision computations), this will not be an issue, and the method will converge to a multiple of the eigenvector $u_1$.
  • If the two largest eigenvalues are equal in magnitude, power iteration will converge to a vector that is a linear combination of the corresponding eigenvectors (or fail to converge). This is a real problem that cannot be discounted in practice. Other methods should be used in this case.

Estimating the eigenvalue

We want to approximate the eigenvalue $\lambda_1$ using the solution of power iteration

$$x_k = (\lambda_1)^k \alpha_1 u_1 + (\lambda_1)^k \left(\frac{\lambda_2}{\lambda_1}\right)^k \alpha_2 u_2 + (\lambda_1)^k \left[ \left(\frac{\lambda_3}{\lambda_1}\right)^k \alpha_3 u_3 + \dots \right]$$

$x_k$ approaches a multiple of the eigenvector $u_1$ as $k \to \infty$, hence

$$x_k \approx (\lambda_1)^k \alpha_1 u_1$$

but also

$$x_{k+1} \approx (\lambda_1)^{k+1} \alpha_1 u_1 \implies x_{k+1} \approx \lambda_1 x_k$$

We can then approximate $\lambda_1$ as the ratio of corresponding entries of the vectors $x_{k+1}$ and $x_k$, i.e.,

$$\lambda_1 \approx \frac{(x_{k+1})_j}{(x_k)_j}$$
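
As a quick sanity check (a sketch, not part of the original notebook), we can build a fresh matrix with a well-separated dominant eigenvalue and compare the entry-ratio estimate with the quotient $\frac{x^T A x}{x^T x}$ used earlier. The names Dr and Ar below are local to this sketch.

In [ ]:
# Sketch: estimate lambda_1 from the ratio of corresponding entries of x_{k+1} and x_k.
Dr = np.diag([6, 2, 4, 7])        # dominant eigenvalue 7, well separated
Ar = U @ Dr @ U.T

x = x0/la.norm(x0)
for i in range(30):
    xnew = Ar @ x
    ratio = xnew[0]/x[0]          # (x_{k+1})_0 / (x_k)_0 (assumes this entry is not ~0)
    x = xnew/la.norm(xnew)

print('entry-ratio estimate =', ratio)
print('x.T A x / x.T x      =', np.dot(x, Ar @ x)/np.dot(x, x))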

Error of Power Iteration

We define the approximated eigenvector as

$$u_{\text{approx}} = \frac{x_k}{(\lambda_1)^k \alpha_1}$$

and hence the error becomes the part of the power iteration solution that was neglected, i.e.,

$$e = u_{\text{approx}} - u_1 = \left(\frac{\lambda_2}{\lambda_1}\right)^k \frac{\alpha_2}{\alpha_1} u_2 + \left[ \left(\frac{\lambda_3}{\lambda_1}\right)^k \frac{\alpha_3}{\alpha_1} u_3 + \dots \right]$$

and when $k$ is large, we can write (again, we are assuming that $|\lambda_1| > |\lambda_2| \ge |\lambda_3| \ge |\lambda_4| \ge \dots$)

$$e_k \approx \left(\frac{\lambda_2}{\lambda_1}\right)^k \frac{\alpha_2}{\alpha_1} u_2$$

And when we take the norm of the error

$$||e_k|| \approx \left|\frac{\lambda_2}{\lambda_1}\right|^k \left|\frac{\alpha_2}{\alpha_1}\right| \, ||u_2|| \implies ||e_k|| = O\!\left( \left|\frac{\lambda_2}{\lambda_1}\right|^k \right)$$

Convergence of Power Iteration

We want to see what happens to the error from one iteration to the next of power iteration:

$$\frac{||e_{k+1}||}{||e_k||} = \frac{\left|\frac{\lambda_2}{\lambda_1}\right|^{k+1} \left|\frac{\alpha_2}{\alpha_1}\right|}{\left|\frac{\lambda_2}{\lambda_1}\right|^{k} \left|\frac{\alpha_2}{\alpha_1}\right|} = \left|\frac{\lambda_2}{\lambda_1}\right|$$

Or in other words, the error decreases by a constant factor, given by $\left|\frac{\lambda_2}{\lambda_1}\right|$, at each iteration.

Power method has LINEAR convergence!

$$||e_{k+1}|| = \left|\frac{\lambda_2}{\lambda_1}\right| \, ||e_k||$$
or we can also write $||e_k|| = \left|\frac{\lambda_2}{\lambda_1}\right|^k ||e_0||$

Simple Example:

Suppose you are given a matrix with eigenvalues:

[3,4,5]

You use normalized power iteration to approximate the eigenvector corresponding to the dominant eigenvalue, which is given as $x$, and we assume $||x|| = 1$.

You know that the norm of the error of the initial guess is given as

$$||e_0|| = ||x - x_0|| = 0.3$$

How big will the error be after three iterations of power iteration? (Since all vectors have norm 1, the absolute and relative errors are the same.)

$$||e_3|| = \left|\frac{4}{5}\right|^3 ||e_0|| = 0.3 \left|\frac{4}{5}\right|^3 = 0.1536$$
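
A one-line check of this arithmetic (the numbers come from the example above):

In [ ]:
# ||e_3|| = |4/5|^3 * ||e_0||
print(0.3*abs(4/5)**3)   # 0.1536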

Convergence plots

In [ ]:
n=4

lambda_array_ordered = [7, 3, -2, 1]

X = np.random.rand(n,n)
U,_ = sla.qr(X)
D = np.diag(lambda_array_ordered)
A = U@D@U.T
eigl, eigv = la.eig(A)

eig_index_sort = np.argsort(abs(eigl))
eigpos = eig_index_sort[-1]
u1_exact = eigv[:,eigpos]

print('Largest lambda = ', lambda_array_ordered[0])
print('Eigenvector = ', u1_exact)
print('Convergence rate = ', lambda_array_ordered[1]/lambda_array_ordered[0])
In [ ]:
# Generate normalized initial guess
x0 = np.random.random(n)
x = x0/la.norm(x0)

count = 0
diff  = 1
eigs  = [x[0]]
error = [np.abs( eigs[-1]  - lambda_array_ordered[0] )]

# We will use as stopping criteria the change in the
# approximation for the eigenvalue

while (diff > 1e-6 and count < 100):
    count += 1
    xnew = A@x #xk+1 = A xk
    eigs.append(xnew[0]/x[0])
    x = xnew/la.norm(xnew)    
    diff  = np.abs( eigs[-1]  - eigs[-2] )
    error.append( np.abs( eigs[-1]  - lambda_array_ordered[0] ) )    
    print("% 10f % 2e % 2f" %(eigs[-1], error[-1], error[-1]/error[-2])) 
In [ ]:
plt.semilogy(np.abs(error)) 

Inverse Power iteration

What if we are interested in the smallest eigenvalue in magnitude?

Suppose $x, \lambda$ is an eigenpair of $A$, such that $Ax = \lambda x$. What would be an eigenvalue of $A^{-1}$?

$$A^{-1} A x = A^{-1} \lambda x$$

$$I x = \lambda A^{-1} x$$

$$\frac{1}{\lambda} x = A^{-1} x$$

Hence $\frac{1}{\lambda}$ is an eigenvalue of $A^{-1}$.

If we want to find the smallest eigenvalue in magnitude of $A$, we can perform power iteration using the matrix $A^{-1}$; it will converge to $\bar{\lambda} = \frac{1}{\lambda}$, where $\bar{\lambda}$ is the largest eigenvalue of $A^{-1}$ in magnitude and $\lambda$ is the smallest eigenvalue of $A$ in magnitude.

Let's implement that:

In [ ]:
n = 4
D = np.diag([5,-1,2,7])

A = U@D@U.T
eigl, eigv = la.eig(A)

eig_index_sort = np.argsort(abs(eigl))
print(eig_index_sort)
eigpos = eig_index_sort[0]

print(eigv[:,eigpos])
In [ ]:
x0 = np.random.random(n)
nrm = la.norm(x0)
x = x0/nrm

for i in range(20):
    x = la.inv(A)@x
    x = x/la.norm(x)

print("lambdan = ",x.T@A@x/(x.T@x))
print("un = ", x) 

Can you find ways to improve the code snippet above?

In [ ]:
#Inverse Power iteration to get smallest eigenvalue
x0 = np.random.random(n)
nrm = la.norm(x0)
x = x0/nrm
P, L, Um = sla.lu(A)
for k in range(20):
    y = sla.solve_triangular(L, np.dot(P.T, x), lower=True)
    x = sla.solve_triangular(Um, y)
    x = x/la.norm(x)

print("lambdan = ",x.T@A@x/(x.T@x))
print("un = ", x)  

Inverse Shifted Power Iteration

What if we want to find another eigenvalue that is not the largest or the smallest?

Suppose $x, \lambda$ is an eigenpair of $A$, such that $Ax = \lambda x$. We want to find the eigenvalues of the shifted inverse matrix $(A - \sigma I)^{-1}$:

$$(A - \sigma I)^{-1} x = \bar{\lambda} x$$

$$I x = \bar{\lambda} (A - \sigma I) x = \bar{\lambda} (\lambda I - \sigma I) x$$

$$\bar{\lambda} = \frac{1}{\lambda - \sigma}$$

We could write the above eigenvalue problem as

$$B^{-1} x = \bar{\lambda} x, \quad B = A - \sigma I,$$

which can be solved by inverse power iteration, and which will converge to the eigenvalue $\frac{1}{\lambda - \sigma}$ for the eigenvalue $\lambda$ of $A$ closest to the shift $\sigma$.

In [ ]:
n = 4
D = np.diag([5,-7,2,10])

A = U@D@U.T
eigl, eigv = la.eig(A)

print(eigl)
eigv
In [ ]:
#Shifted Inverse Power iteration 
sigma = 1

x0 = np.random.random(n)
nrm = la.norm(x0)
x = x0/nrm
B = A - sigma*np.eye(n)
P, L, Um = sla.lu(B)
for k in range(20):
    y = sla.solve_triangular(L, np.dot(P.T, x), lower=True)
    x = sla.solve_triangular(Um, y)
    x = x/la.norm(x)

print("lambdan = ",x.T@A@x/(x.T@x))
print("un = ", x)  

Computational cost and convergence

  • power iteration: to obtain the largest eigenvalue in magnitude, $\lambda_1$
    • matrix-vector multiplications at each iteration: $O(n^2)$
    • convergence rate: $\left|\frac{\lambda_2}{\lambda_1}\right|$
  • inverse power iteration: to obtain the smallest eigenvalue in magnitude, $\lambda_n$
    • only one factorization: $O(n^3)$
    • backward-forward substitutions to solve at each iteration: $O(n^2)$
    • convergence rate: $\left|\frac{\lambda_n}{\lambda_{n-1}}\right|$
  • inverse shifted power iteration: to obtain an eigenvalue close to a known/given value $\sigma$
    • only one factorization: $O(n^3)$
    • backward-forward substitutions to solve at each iteration: $O(n^2)$
    • convergence rate: $\left|\frac{\lambda_c - \sigma}{\lambda_{c2} - \sigma}\right|$, where $\lambda_c$ is the closest eigenvalue to $\sigma$ and $\lambda_{c2}$ is the second closest eigenvalue to $\sigma$
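
To make the convergence-rate formulas above concrete, here is a small sketch (illustrative only; the spectrum and shift are made-up example values) that evaluates each rate for a given set of eigenvalues:

In [ ]:
# Illustration: predicted linear convergence rates for the three methods,
# for an example spectrum and shift.
lam   = np.array([7.0, 3.0, -2.0, 1.0])       # example eigenvalues
sigma = 2.5                                    # example shift

mag = np.sort(np.abs(lam))[::-1]               # magnitudes, largest first
print('power iteration rate         =', mag[1]/mag[0])    # |lambda_2/lambda_1|
print('inverse power iteration rate =', mag[-1]/mag[-2])  # |lambda_n/lambda_{n-1}|

dist = np.sort(np.abs(lam - sigma))            # distances to the shift
print('shifted inverse rate         =', dist[0]/dist[1])  # |lambda_c - sigma|/|lambda_c2 - sigma|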